(B) STREET: 119 NORTH FOURTH STREET, SUITE 203
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(ix) TELECOMMUNICATION INFORMATION:
(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5194 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
CCCAAAAAAG ATAAAATAAA AACAAAACAA AACAAAAGTA CTAACAAATT ATTGAAACTT 60 
TTAATTTTTA ATAAAGAATC AGTAGATCTA TTGTTAAAAG AAATGAACTC AACTCCAAGT 120




AAATTATTAC CGATAGATAA ACATTCTCAT TTACAATTAC AGCCTCAATC GTCCTCGGCA 180


TCAATATTTA ATTCCCCAAC AAAACCATTG AATTTCCCCA GAACAAATTC CAAGCCGAGT 240




TTAGATCCAA 


ATTCAAGCTC 


TGATACCTAC 


A /~T A fCf A A /~ 

AC 1 AGCGAAC 


AAVjA 1 CAAoA 






GAAGAGAAAA 


AGGACACAGC 


C 1 1 1 CAAACA 


TC 1 1 1 1 GAT A 


VjAAA 1 1 1 1 tiA 


x/ — rrrATA ax 
lt.ll oA 1 AA 1 


jDU 


TCAATCGATA 


— »— a A A X"" AAA y* 

TACAACAAAC 


A A t 1 /~ A A f~ A "T" 

AATTCAACAT 


/— A /~ /~ A A A A /~ 

CAGCAACAAC 


ACjLLACAACA 


ArAATAAfAA 
AC AA AA LAA 




CTCTCACAAA 


CCGACAATAA 


* 1 1 a A l"'r/~ A T 

TTTAATTGAT 


GAA 1 1 1 1 CI 1 


XT/" A A ATA /T* 
1 1 CAAACACC 


OA 1 VjAL 1 1 Uo 


4R0 

HOU 


ACI 1 lAGACC 


T" A A ^* ^ A A f~ f~ A 

TAACCAAGCA 


AAA ~T~ f~ f~ A A / T" 

AAATCCAACT 


GTGGACAAAG 


~rf a a A A A A 

i LiAA I CjAAAA 


1 CA 1 oLALLA 


JtU 


ACTTATATAA 


ATACCTCCCC 


/— a a /- a A A T f~ A 

CAACAAATCA 


a -r- a A T/~ A A A A 

ATAATGAAAA 


AGuCAAC 1 CC 


ta a KC CCTC A 
1 AAAvjv-Vj 1 v,A 




CCTAAAAAAG 


TTGCATTTAC 


TGTAACTAAT 


CCCGAAATTC 


a Tf ai — r a t r~ c 
A 1 CA 1 1 A 1 CC 


AuA 1 AA 1 AuA 


ODU 


GTCGAGGAAG 


A A A ■^H' AAA 

AAGATCAAAG 


T^" A A A A A A A 

TCAACAAAAA 


/" A A /"• A 1 I /~ A /~ 

GAAGATTCAG 


-i — r/~ a r~ c r a /t~ 
TTGAGCCACC 


/ — rTA ATAfA A 
C 1 1 AA 1 ALAA 




CATCAATGGA 


AAGATCCTTC 


TCAATTCAAT 


TATTCTGATG 


A A /~ a t a a a a 

AAGATACAAA 


Tft — i — r/^ a / — i — r 
1 VaC 1 1 CAIj 1 1 


/ ou 


CCACCAACAC 


CACCACTTCA 


—I— a ^* A AAA 

TACGACGAAA 


/— *~-W~ A / 1 — | I 1 /— 

CCTACTTTTG 


f— /— /— is. AT I A 1' T 

CGCAATTATT 


/~AA/~AAAAA/~ 

CaAACAAAAAC 


RAH 


AACGAAGTCA 


ATCTGGAACC 


A A y y A "¥ ¥ y 

AGAGGCATTG 


A A A T* A ~¥~/~" A 

ACAGATATGA 


A A I 1 A A A/™/~^" 

AATTAAAGCG 


/^/~ A A A AT — 1 — T/~ 

CoAAAA 1 1 1 C 


yuu 


AGCAATTTAT 


CATTAGATGA 


A A A A ■■ A A T~ 

AAAAGTCAAT 


~i — 1~ a "i - a ^rt" i l a 

TTATATCTTA 


GTCCCAC 1 AA 


TA ATA A /~ A A T 

1 AA 1 AACAA 1 


you 


AGTAAGAATG 


TGTCAGATAT 


GGATCTGCAT 


1 i a y A A A A f ' 

TTACAAAACT 


TGCAAGACGC 


■ I /~/~ A A A A A /~ 

TTCGAAAAAC 


xuzu 


AAAACTAATG 


AAAATATTCA 


CAAl I ifciCA 


TTTGCI 1 1 AA 


A A X~ A /~ AAA 

AAGCACCAAA 


a A T/~ ATA T — 1~ 

GAATGATATT 


i ncn 


GAAAACCCAT 


TAAACTCATT 


GACTAACGCA 


y* a a i i y 1 r/ i 

GATATTCTGT 


TAAGATCATC 


T/~/~ A T/~ A T/"" A 

TGGATCATCA 


±±H\J 


CAATCGTCAT 


TACAATC 1 I I 


GAGGAATGAC 


AATCGTGTCT 


TGGAATCAGT 


GCCTGGGTCA 


xzuu 


CCTAAGAAGG 


TTAATCCTGG 


Al IGIC1 1 IG 


AATGACGGCA 


T A A A /~/~/~/ 1 1 

TAAAGGGGT T 


/ — r/ — r/~ a T/~ A f 

C 1 C 1 uA 1 VjACj 


XZDU 


GTTGTTGAAT 


CATTACTTCC 


TCGTGACTTA 


TCTCGAGACA 


A A I 1 A A A 

AATTAGAGAC 


t \ f A A A A/~A A 

TACAAAAGAA 




CATGATGCAC 


CAGAACACAA 


CAATGAGAAT 


1 1 IATTGATG 


/ p AAA "T"/*"/~ A /" 

CTAAATCGAC 


T A A T A /~/~ A A T 

TAA 1 ACCAA 1 


IjoU 


AAGGGACAAC 


TCTTAGTATC 


ATCTGATGAT 


CATTTGGACT 


C 1 1 1 1 GAT AG 


a ~r/~ / — r a t A A /~ 

ATCCTATAAC 




CACACTGAAC AATCAATTTT GAATCTTTTG AATAGTGCAT CACAATCTCA AATTTCGTTA 1500
AATGCATTGG AAAAACAAAG GCAAACACAG GAACAAGAAC AAACACAAGC GGCAGAGCCT 1560
GAAGAAGAAA CTTCGTTTAG TGATAATATC AAAGTTAAAC AAGAGCCAAA GAGCAATTTG 1620
GAGTTTGTCA AGGTTACCAT CAAGAAAGAA CCAGTTCTGG CCACGGAAAT AAAAGCTCCA 1680
AAAAGAGAAT TTTCAAGTCG AATATTAAGA ATAAAAAATG AAGATGAAAT TGCCGAACCA 1740
GCTGATATTC ATCCTAAAAA AGAAAATGAA GCAAACAGTC ATGTCGAAGA TACTGATGCA 1800


TTGTTGAAGA AAGCACTTAA TGATGATGAG GAATCTGACA CGACCCAAAA CTCAACGAAA 1860
ATGTCAATTC GTTTTCATAT TGATAGTGAT TGGAAATTGG AAGACAGTAA TGATGGCGAT 1920
AGAGAAGATA ATGATGATAT TTCTCGTTTT GAGAAATCAG ATATTTTGAA CGACGTATCA 1980
CAGACTTCTG ATATTATTGG TGACAAATAT GGAAACTCAT CAAGTGAAAT AAGCACCAAA 2040






ACATTAGCAC 


CCCCAAGATC 


y y A A A /~ A A T 

GGACAACAAT 


/•— A /~ A A C C A A 

GACAAbbAbA 


AT 1"/ — r A A A"T/~ 

A M C 1 AAA 1 C 


TT XCCA A ft AT 
1 1 1 o^jAAVjM 1 


C X.\J\J 


CCAGCTAATA 


ATGAATCATT 


y y a a y a a y a a 

GCAACAACAA 


i i y y a /~/~~r a y 

TTGGAGG T AC 


/-/—/— AT A /" A A A 

CbCA 1 ACAAA 


AC A AC A ~TC AT 

AbAAbA 1 bA 1 


71 fiO 


AGCA1 1 1 lAG 


CCAACTCGTC 


/— A A T" A 1 1 y / — r* 

CAATATTGCT 


y y a y / — r/~ a a /~ 
CCACC 1 bAAb 


A A T~T/~ A / — 1 — 1 — T 

AA 1 1 IjAC 1 1 1 


bCCCb 1 Avi 1 o 


???0 


GAAGCAAATG 


ATTATTCATC 


1 1 1 IAATGAC 


GTbACCAAAA 


/ — 1 — | — 1 — YC KTCC 

C II M viA 1 UC 


AT At — rCAACC 
A 1 AC 1 LAAuL 


??R0 

£. lOU 


TTTGAAGAGT 


CATTATCTAG 


AGAGCACGAA 


a c~rc a t — r/~ a a 
ACTbAT 1 CAA 


A \rT A ATT A A 

AACCAA 1 1 AA 


TT 1 rATATfA 
1 1 1 CA 1 A 1 CA 




ATTTGGCATA 


A A y A A y* A A A A 

AACAAGAAAA 


/~ /~ A /~ A A /— A A A 

GCAGAAGAAA 


y A -py AAA T — T/~ 

CATCAAA1 1 C 


ata a Arirrr 
A 1 AAAvj 1 1 CC 


A At — TA A ACAfZ 
AAC 1 AAACAb 


7400 


ATCATTGCTA 


X— — I r— A "~ I - y A A y* A 

GTTATCAACA 


A T A ^* A A A A A/~ 

ATACAAAAAC 


/"A A C A ATA AT 

bAACAAbAA 1 


C 1 Lu 1 o M AC 


TA/ — mATAAA 
1 AVj 1 V3A 1 MAM 




GTGAAAATCC 


CAAATGCCAT 


A /~ A A I "r^" A A /*" 

ACAATTCAAb 


A A AT — r/~ A A A/~ 

AAA 1 1 CAAAb 


A CC~Y AAA Ti" — T 

Auu 1 AAA 1 o 1 


C A Tt — rC A AC A 
CA 1 b 1 CAAoA 


? ^70 


AGAGTTGTTA 


>~ ■ y— x** A y A y* A T 

GTCCAGACAT 


GGATGA IMG 


AATG T A 1 C 1 C 


A AT — 1 — 1 — 1 — T A /T" 

AA Mill ACC 


AC A ATT AT/ — T 
AbAA 1 1 A 1 C 1 


7 ^R0 


GAAGACTCTG 


y A Til AAA y A 

GATTTAAAGA 


TTTGAAi 1 1 1 


ccc a a / — r a / — r 
GCCAACTAC 1 


CCAA 1 AACAC 


C A A C A d A CC A 
CAA C Ab A C C A 


7it40 


AGAAGl 1 I lA 


y^^y y a t *r*y a y 

CTCCATTGAG 


y A /" ' r* A A A A A "T~ 

CACTAAAAAT 


/ — i-/ — i — r/ — r r~ r~ a 

GTCTT GTCGA 


a -p ATT C ATA A 

A 1 A M oA 1 AA 


CC ATCt — TA AT 
CbA 1 CC 1 AA 1 


7700 

£ 1 \J\J 


GTTGTTGAAC 


j ■ ■ j— y i A A yy 

CTCCTGAACC 


/—AAA T/~ A "T" A T 

GAAATCATAT 


/— y-ry a A a 1 1 A 

GCTGAAATTA 


C A A AT/"/ — r AC 

CAAA 1 vjC 1 Ab 


A CCt — ITATrA 
ACbb MAI CA 


77fi0 


GCTAATAAGG 


CAGCGCCAAA 


-r s~- a yyy a yy a 

TCAGGCACCA 


y y A T T /*" ff A C 

CCATTGCCAC 


C A /~ A A C C AC A 

CACAACGACA 


A CC A Tt — 1 — rC A 

ACCA 1 C 1 1 CA 


7870 


ACTCGTTCCA 


ATTCAAATAA 


a yy a y ■ i 1 1 y 1 i y y 

ACGAGTGTCC 


AGA 1 1 1 AGAG 


TCCCC AC A T — T 

TGCCCACA 1 1 


TC AAA 1 1 AC A 

1 bAAA 1 1 AbA 


7RRO 


AGAACTTCTT 


CAGCATTAGC 


ACC 1 lb 1 GAC 


A TrT A T" A A ~T~C 

ATGTATAATG 


ATA Mill (j A 


tc a r vTcet — r 
1 bA 1 1 1 Cbb 1 


7Q40 


GCGGGTTCTA 


AACCAACTAT 


AAA yyy A y A A 

AAAGGCAGAA 


y y A A T/* A A A A 

GGAATGAAAA 


/~ A T — rCCC A AC 

CATT GCCAAG 


T AT CC ATA A A 

1 A 1 bbA 1 AAA 


3000 


GATGATGTCA 


AGAGGAl I I I 


GAATGCAAAG 


AAA /— / I - / A 

AAAGGTGTGA 


/ — r/~ a A /~ A T/*~ A 

CTCAAGATGA 


ATATATA A AT 

ATA1 A 1 AAA 1 


jUDU 


GCCAAACTTG 


TTGATCAAAA 


A y A A A A A 

ACCTAAAAAG 


A A 1 1 /■— A A | 1 /~ 

AATTCAATTG 


TCACCGATCC 


CC A AC A CCC A 

CbAAbACCbA 


3 1 7fl 


TATGAAGAAT 


TACAACAAAC 


■fy ■ i /■■■ t a t a 

TGCCTCTATA 


/— a a A T/~ r~ r~ A 

CACAATGCCA 


cc a ~i — rc A t — rc 
CCA 1 1 bA 1 1 C 


A At — TATTTAT 

AAb 1 A 1 1 1 A 1 


31 ao 


GGCCGACCAG 


ACTCCAI 1 IC 


— ^ a yyy A y A T"y* 

TACCGACATG 


MGCCTTATC 


t — r a / — rc A t/— A 
1 1 AC 1 bA 1 bA 


a TTC A A A A A A 

A 1 1 bAAAAAA 


3740 


CCACCTACGG 


CI 1 lATTATC 


TGCTGATCGT 


1 1 1 ATGG 


A A C A A C A A C~t~ 

AACAAbAAb 1 


AC ATCCt — TTA 

ACA 1 CCb 1 1 A 


3 300 


AGATCAAACT 


CTG MM GGT 


— »— ^— a yyy a y y y 

TCACCCAGGG 


GCAGGAGCAG 


c a At — r a at — rc 
CAACTAA 1 1 C 


TTC A AT/ — TTA 

1 1 CAA 1 b II A 


jjDU 


CCAGAGCCAG 


AMI IGAATT 


A A A A 1 1 y A 

AATCAATTCA 


/— — r/~ / — r~ a /~ a A 

CCTGCTAGAA 


a t/ — rc t — rc A a 
ATbTbC 1 bAA 


C A AC At — rrAT 
CAACAb 1 bA 1 


3470 


AATGTCGCCA 


TCAGTGGTAA 


TGCTAGTACT 


ATTAG II M A 


A f~ f~ A A T — rC C A 

ACCAATTbbA 


1 A 1 bAA MM 




GATGACCAAG 


CTACAATTGG 


T""y A A A A A A ^Hy 

TCAAAAAATC 


/— a a /~" A /~ /~ A A /~ 

CAAGAGCAAC 


CTCt — 1 — rC A A A 

C 1 bC 1 1 CAAA 


ATCC CCC A AT 

A 1 CCbCCAA 1 




ACTGTTCGTG 


GTGATGATGA 


TGGATTGGCC 


AGTGCACCTG 


A A A C A CC A A C 

AAACACCAAb 


A A CTCt — r A CC 

AAC 1 CC 1 ACC 


3^00 


AAAAAGGAGT 


CCATATCAAG 


^— A A y yy-'T /— y— 

CAAGCCTGCC 


AAGC 1 1 1 C 1 1 


CTCCCTCCCC 

CTGCC 1 CCCC 


T A C A A A A TC A 

\ AbAAAA 1 CA 


3 DDU 


CCAATTAAGA TTGGTTCACC AGTTCGAGTT ATTAAGAAAA ATGGATCAAT TGCTGGCATT 3720
GAACCAATCC CAAAAGCCAC TCACAAACCG AAGAAATCAT TCCAAGGAAA CGAGATTTCA 3780
AACCATAAAG TACGAGATGG TGGAATTTCA CCAAGCTCCG GATCAGAGCA TCAACAGCAT 3840
AATCCTAGTA TGGTTTCTGT TCCTTCACAG TATACTGATG CTACTTCAAC GGTTCCAGAT 3900
GAAAACAAAG ATGTTCAACA CAAGCCTCGT GAAAAGCAAA AGCAAAAGCA TCACCATCGC 3960








CATCATCATC ATCATCATAA ACAAAAAACT GATATTCCGG GTGTTGTTGA TGATGAAATT 4020




CCTGATGTAG 


GATTACAAGA 


ACGAGGCAAA 


TTA 1 1 L 1 1 1 A 


/- a / — r — i — i — r a f~ C 
GAG MM AGG 


A ATT A A^I A AX 
AA 1 1 AAyAA 1 


40R0 




ATCAATTTAC 


CCGATATTAA 


— »— * * ■ 1 A AAA 

TACTCACAAA 


/— /— a A A TT/™ A 

GGAAGATTCA 


/ — 1 — i — r A A ft — TT" 

LI 1 1 AALG 1 1 


GGA 1 AA 1 GGA 


41 40 




GTGCATTGTG 


TTACTACACC 


A >*~ A A "f" A A A 

AGAATACAAC 


ATGGACGACC 


ATA A T/ — ! — TCC 

A 1 AA 1 G M GL 


TATA^TAAA 
L.A 1 Auu 1 AAA 


4700 




GAATTTGAGT 


TGACAGTTGC 


TGATTCATTA 


/— a /~t 1 1 A I 1 1 

GAG 1 1 1 AT T 1 


t a a/ — r~\ — YC A A 
1 AAL 1 1 1 oAA 


VJULM 1 V_A 1 A 1 


4?fi0 




GAAAAACCTC 


GTGGTACATT 


AGTAGAAGTG 


A / — f~/— A A A A f A 

ACTGAAAAGA 


A A t — TT/" — T/~ A A 

AAG 1 lul LAA 


ATfAA^AAAT 
A 1 LAAvaAAA 1 






AGATTGAGTC 


GATTATTTGG 


A AAA A T 

ATCGAAAGAT 


a t — r a t/~ a A 

ATTATCACCA 


rr AC A A a/ — IT 
LGALAAAG I 1 


TCTflCCr A CT 
1 Kj I V3LV-LAL 1 


T JOv 




GAAGTCAAAG 


ATACCTGGGC 


-»- » A T" A A / 1 1 1 

TAATAAG 1 1 1 


GCTCCTGA FG 


/ — 1 — rr- a TTTfr 
G 1 1 LA 1 1 1 GL 


1 AoA lull MV_ 


4440 




ATTGATTTAC 


AACAATTTGA 


a ^* a y y~ AAA ""^ 

AGACCAAATC 


ACCGGTAAAG 


A Tf~ A /" A / — I — T 

LA 1 LALAG 1 1 


T^ATfTfAAT 
1 GA 1 v, 1 LAA 1 


4^00 




TGI 1 1 IAATG 


AATGGGAAAC 


TATGAGTAAT 


GGCAATCAAC 


f A A Tf A A A A C 

CAA 1 oAAAAo 


ArrrAAAffT 
AooLAAALL 1 


4^fi0 

HJuU 




TATAAGATTG 


CTCAATTGGA 


AGTTAAAATG 


I | y ■ A TV 1 1 " ^~ 

TTGTATGTTC 


f A /~/~ A ~P/~ A /~ A 

CACGA 1 LAGA 


1 LLAAbAuAA 


4fi70 


o 


ATATTACCAA 


CCAGCATTAG 


ATCCGCATAT 


/— A A A /~ /~ A ~r~f~ A 

GAAAGCATCA 


A ~T/~ A A T — T AAA 

A 1 GAA I \ AAA 


C A ATrAATA^ 
LAA I GAALAVj 


46R0 




A AT A ATT ACT 


TTGAAGGTTA 


TTTACATCAA 


GAAGGAGGTG 


A f 1 /~T~ f~ f~ A A T 

ATTGTCCAA 1 


TTTTA A/~ A A A 

MM AAGAAA 


4740 




CGI 1 1 1 1 ICA 


AATTAATGGG 


CAC IILlltA 


TTGGCTCATA 


/~-r-/~ AAA T A ~T/~ 

GTGAAATATC 


T/-ATA A A A t — T 

1 CA 1 AAAAL 1 


4ROO 
HOW 




AGAGCCAAAA 


TTAATTTATC 


AAAAGTTGTT 


A ■■■ /** A 111 

GATTTGATTT 


a t"/ — i — rr- a ~i~ a A 

ATGTTGATAA 


A/~ A A A A /"ATT 

AGAAAALA 1 1 


4R^O 


\~ 


GATCGTTCCA 


ATCATCGAAA 


TTTCAGTGAT 


GTGTTATTGT 


TGGATCA 1 GL 


A 1" V (~ A A A ATr 

A 1 1 LAAAA 1 L 


4Q70 




AAATTTGCTA 


ATGGTGAGTT 


GATTGAl I I I 


TGTGCTCCTA 


ATA A A f~ A T/~ A 

ATAAACA 1 GA 


A A TC A A A ATA 

AA 1 GAAAA 1 A 


4Q80 




TGGATTCAAA 


A ail a A A ^ m A 

ATTTACAAGA 


A A "I I A ~1~/~~T~ A T 

AATTATCTAT 


A A A A T f~~ C f~~V 

AGAAA 1 LGG 1 


rr a c a fv — rr* a 

1 LAuALu 1 LA 


A CC A T a C* fTT A 

rtv_V»H 1 UUVJ I AA 


5040 




AATTTGATGC TTCAACAACA ACAACAACAA CAACAACAAC AAAGCTCCCA ACAGTAATTG 5100
AAAGGTCTAC TTTTGATTTT TTTAATTTTA ATTGGCAAAT ATATGCCCAT TTTGTATTAT 5160




CTTTTAGTCT AATAGCGTTT TCTTTTTTTC CAGT 5194




(2) INFORMATION FOR SEQ ID NO: 2: 











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1664 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Asn Ser Thr Pro Ser Lys Leu Leu Pro Ile Asp Lys His Ser His
1 5 10 15

Leu Gln Leu Gln Pro Gln Ser Ser Ser Ala Ser Ile Phe Asn Ser Pro
20 25 30

Thr Lys Pro Leu Asn Phe Pro Arg Thr Asn Ser Lys Pro Ser Leu Asp
35 40 45

Pro Asn Ser Ser Ser Asp Thr Tyr Thr Ser Glu Gln Asp Gln Glu Lys
50 55 60

Gly Lys Glu Glu Lys Lys Asp Thr Ala Phe Gln Thr Ser Phe Asp Arg
65 70 75 80

Asn Phe Asp Leu Asp Asn Ser Ile Asp Ile Gln Gln Thr Ile Gln His
85 90 95

Gln Gln Gln Gln Pro Gln Gln Gln Gln Gln Leu Ser Gln Thr Asp Asn
100 105 110

Asn Leu Ile Asp Glu Phe Ser Phe Gln Thr Pro Met Thr Ser Thr Leu
115 120 125

Asp Leu Thr Lys Gln Asn Pro Thr Val Asp Lys Val Asn Glu Asn His
130 135 140

Ala Pro Thr Tyr Ile Asn Thr Ser Pro Asn Lys Ser Ile Met Lys Lys
145 150 155 160

Ala Thr Pro Lys Ala Ser Pro Lys Lys Val Ala Phe Thr Val Thr Asn
165 170 175

Pro Glu Ile His His Tyr Pro Asp Asn Arg Val Glu Glu Glu Asp Gln
180 185 190

Ser Gln Gln Lys Glu Asp Ser Val Glu Pro Pro Leu Ile Gln His Gln
195 200 205

Trp Lys Asp Pro Ser Gln Phe Asn Tyr Ser Asp Glu Asp Thr Asn Ala
210 215 220

Ser Val Pro Pro Thr Pro Pro Leu His Thr Thr Lys Pro Thr Phe Ala
225 230 235 240

Gln Leu Leu Asn Lys Asn Asn Glu Val Asn Ser Glu Pro Glu Ala Leu
245 250 255

Thr Asp Met Lys Leu Lys Arg Glu Asn Phe Ser Asn Leu Ser Leu Asp
260 265 270

Glu Lys Val Asn Leu Tyr Leu Ser Pro Thr Asn Asn Asn Asn Ser Lys
275 280 285

Asn Val Ser Asp Met Asp Ser His Leu Gln Asn Leu Gln Asp Ala Ser
290 295 300

Lys Asn Lys Thr Asn Glu Asn Ile His Asn Leu Ser Phe Ala Leu Lys
305 310 315 320

Ala Pro Lys Asn Asp Ile Glu Asn Pro Leu Asn Ser Leu Thr Asn Ala
325 330 335

Asp Ile Ser Leu Arg Ser Ser Gly Ser Ser Gln Ser Ser Leu Gln Ser
340 345 350

Leu Arg Asn Asp Asn Arg Val Leu Glu Ser Val Pro Gly Ser Pro Lys
355 360 365

Lys Val Asn Pro Gly Leu Ser Leu Asn Asp Gly Ile Lys Gly Phe Ser
370 375 380
Asp Glu Val Val Glu Ser Leu Leu Pro Arg Asp Leu Ser Arg Asp Lys
385 390 395 400

Leu Glu Thr Thr Lys Glu His Asp Ala Pro Glu His Asn Asn Glu Asn
405 410 415

Phe Ile Asp Ala Lys Ser Thr Asn Thr Asn Lys Gly Gln Leu Leu Val
420 425 430

Ser Ser Asp Asp His Leu Asp Ser Phe Asp Arg Ser Tyr Asn His Thr
435 440 445

Glu Gln Ser Ile Leu Asn Leu Leu Asn Ser Ala Ser Gln Ser Gln Ile
450 455 460

Ser Leu Asn Ala Leu Glu Lys Gln Arg Gln Thr Gln Glu Gln Glu Gln
465 470 475 480

Thr Gln Ala Ala Glu Pro Glu Glu Glu Thr Ser Phe Ser Asp Asn Ile
485 490 495

Lys Val Lys Gln Glu Pro Lys Ser Asn Leu Glu Phe Val Lys Val Thr
500 505 510

Ile Lys Lys Glu Pro Val Ser Ala Thr Glu Ile Lys Ala Pro Lys Arg
515 520 525

Glu Phe Ser Ser Arg Ile Leu Arg Ile Lys Asn Glu Asp Glu Ile Ala
530 535 540

Glu Pro Ala Asp Ile His Pro Lys Lys Glu Asn Glu Ala Asn Ser His
545 550 555 560

Val Glu Asp Thr Asp Ala Leu Leu Lys Lys Ala Leu Asn Asp Asp Glu
565 570 575

Glu Ser Asp Thr Thr Gln Asn Ser Thr Lys Met Ser Ile Arg Phe His
580 585 590

Ile Asp Ser Asp Trp Lys Leu Glu Asp Ser Asn Asp Gly Asp Arg Glu
595 600 605

Asp Asn Asp Asp Ile Ser Arg Phe Glu Lys Ser Asp Ile Leu Asn Asp
610 615 620

Val Ser Gln Thr Ser Asp Ile Ile Gly Asp Lys Tyr Gly Asn Ser Ser
625 630 635 640

Ser Glu Ile Thr Thr Lys Thr Leu Ala Pro Pro Arg Ser Asp Asn Asn
645 650 655

Asp Lys Glu Asn Ser Lys Ser Leu Glu Asp Pro Ala Asn Asn Glu Ser
660 665 670

Leu Gln Gln Gln Leu Glu Val Pro His Thr Lys Glu Asp Asp Ser Ile
675 680 685

Leu Ala Asn Ser Ser Asn Ile Ala Pro Pro Glu Glu Leu Thr Leu Pro
690 695 700

Val Val Glu Ala Asn Asp Tyr Ser Ser Phe Asn Asp Val Thr Lys Thr
705 710 715 720



Phe Asp Ala Tyr Ser Ser Phe Glu Glu Ser Leu Ser Arg Glu His Glu
725 730 735

Thr Asp Ser Lys Pro Ile Asn Phe Ile Ser Ile Trp His Lys Gln Glu
740 745 750

Lys Gln Lys Lys His Gln Ile His Lys Val Pro Thr Lys Gln Ile Ile
755 760 765

Ala Ser Tyr Gln Gln Tyr Lys Asn Glu Gln Glu Ser Arg Val Thr Ser
770 775 780

Asp Lys Val Lys Ile Pro Asn Ala Ile Gln Phe Lys Lys Phe Lys Glu
785 790 795 800

Val Asn Val Met Ser Arg Arg Val Val Ser Pro Asp Met Asp Asp Leu
805 810 815

Asn Val Ser Gln Phe Leu Pro Glu Leu Ser Glu Asp Ser Gly Phe Lys
820 825 830

Asp Leu Asn Phe Ala Asn Tyr Ser Asn Asn Thr Asn Arg Pro Arg Ser
835 840 845

Phe Thr Pro Leu Ser Thr Lys Asn Val Leu Ser Asn Ile Asp Asn Asp
850 855 860

Pro Asn Val Val Glu Pro Pro Glu Pro Lys Ser Tyr Ala Glu Ile Arg
865 870 875 880

Asn Ala Arg Arg Leu Ser Ala Asn Lys Ala Ala Pro Asn Gln Ala Pro
885 890 895

Pro Leu Pro Pro Gln Arg Gln Pro Ser Ser Thr Arg Ser Asn Ser Asn
900 905 910

Lys Arg Val Ser Arg Phe Arg Val Pro Thr Phe Glu Ile Arg Arg Thr
915 920 925

Ser Ser Ala Leu Ala Pro Cys Asp Met Tyr Asn Asp Ile Phe Asp Asp
930 935 940

Phe Gly Ala Gly Ser Lys Pro Thr Ile Lys Ala Glu Gly Met Lys Thr
945 950 955 960

Leu Pro Ser Met Asp Lys Asp Asp Val Lys Arg Ile Leu Asn Ala Lys
965 970 975

Lys Gly Val Thr Gln Asp Glu Tyr Ile Asn Ala Lys Leu Val Asp Gln
980 985 990

Lys Pro Lys Lys Asn Ser Ile Val Thr Asp Pro Glu Asp Arg Tyr Glu
995 1000 1005

Glu Leu Gln Gln Thr Ala Ser Ile His Asn Ala Thr Ile Asp Ser Ser
1010 1015 1020

Ile Tyr Gly Arg Pro Asp Ser Ile Ser Thr Asp Met Leu Pro Tyr Leu
1025 1030 1035 1040

Ser Asp Glu Leu Lys Lys Pro Pro Thr Ala Leu Leu Ser Ala Asp Arg
1045 1050 1055

Leu Phe Met Glu Gln Glu Val His Pro Leu Arg Ser Asn Ser Val Leu
1060 1065 1070

Val His Pro Gly Ala Gly Ala Ala Thr Asn Ser Ser Met Leu Pro Glu
1075 1080 1085

Pro Asp Phe Glu Leu Ile Asn Ser Pro Ala Arg Asn Val Ser Asn Asn
1090 1095 1100

Ser Asp Asn Val Ala Ile Ser Gly Asn Ala Ser Thr Ile Ser Phe Asn
1105 1110 1115 1120

Gln Leu Asp Met Asn Phe Asp Asp Gln Ala Thr Ile Gly Gln Lys Ile
1125 1130 1135

Gln Glu Gln Pro Ala Ser Lys Ser Ala Asn Thr Val Arg Gly Asp Asp
1140 1145 1150

Asp Gly Leu Ala Ser Ala Pro Glu Thr Pro Arg Thr Pro Thr Lys Lys
1155 1160 1165

Glu Ser Ile Ser Ser Lys Pro Ala Lys Leu Ser Ser Ala Ser Pro Arg
1170 1175 1180

Lys Ser Pro Ile Lys Ile Gly Ser Pro Val Arg Val Ile Lys Lys Asn
1185 1190 1195 1200

Gly Ser Ile Ala Gly Ile Glu Pro Ile Pro Lys Ala Thr His Lys Pro
1205 1210 1215

Lys Lys Ser Phe Gln Gly Asn Glu Ile Ser Asn His Lys Val Arg Asp
1220 1225 1230

Gly Gly Ile Ser Pro Ser Ser Gly Ser Glu His Gln Gln His Asn Pro
1235 1240 1245

Ser Met Val Ser Val Pro Ser Gln Tyr Thr Asp Ala Thr Ser Thr Val
1250 1255 1260

Pro Asp Glu Asn Lys Asp Val Gln His Lys Pro Arg Glu Lys Gln Lys
1265 1270 1275 1280

Gln Lys His His His Arg His His His His His His Lys Gln Lys Thr
1285 1290 1295

Asp Ile Pro Gly Val Val Asp Asp Glu Ile Pro Asp Val Gly Leu Gln
1300 1305 1310

Glu Arg Gly Lys Leu Phe Phe Arg Val Leu Gly Ile Lys Asn Ile Asn
1315 1320 1325

Leu Pro Asp Ile Asn Thr His Lys Gly Arg Phe Thr Leu Thr Leu Asp
1330 1335 1340

Asn Gly Val His Cys Val Thr Thr Pro Glu Tyr Asn Met Asp Asp His
1345 1350 1355 1360

Asn Val Ala Ile Gly Lys Glu Phe Glu Leu Thr Val Ala Asp Ser Leu
1365 1370 1375

Glu Phe Ile Leu Thr Leu Lys Ala Ser Tyr Glu Lys Pro Arg Gly Thr
1380 1385 1390

Leu Val Glu Val Thr Glu Lys Lys Val Val Lys Ser Arg Asn Arg Leu
1395 1400 1405

Ser Arg Leu Phe Gly Ser Lys Asp Ile Ile Thr Thr Thr Lys Phe Val
1410 1415 1420

Pro Thr Glu Val Lys Asp Thr Trp Ala Asn Lys Phe Ala Pro Asp Gly
1425 1430 1435 1440

Ser Phe Ala Arg Cys Tyr Ile Asp Leu Gln Gln Phe Glu Asp Gln Ile
1445 1450 1455

Thr Gly Lys Ala Ser Gln Phe Asp Leu Asn Cys Phe Asn Glu Trp Glu
1460 1465 1470

Thr Met Ser Asn Gly Asn Gln Pro Met Lys Arg Gly Lys Pro Tyr Lys
1475 1480 1485

Ile Ala Gln Leu Glu Val Lys Met Leu Tyr Val Pro Arg Ser Asp Pro
1490 1495 1500

Arg Glu Ile Leu Pro Thr Ser Ile Arg Ser Ala Tyr Glu Ser Ile Asn
1505 1510 1515 1520

Glu Leu Asn Asn Glu Gln Asn Asn Tyr Phe Glu Gly Tyr Leu His Gln
1525 1530 1535

Glu Gly Gly Asp Cys Pro Ile Phe Lys Lys Arg Phe Phe Lys Leu Met
1540 1545 1550

Gly Thr Ser Leu Leu Ala His Ser Glu Ile Ser His Lys Thr Arg Ala
1555 1560 1565

Lys Ile Asn Leu Ser Lys Val Val Asp Leu Ile Tyr Val Asp Lys Glu
1570 1575 1580

Asn Ile Asp Arg Ser Asn His Arg Asn Phe Ser Asp Val Leu Leu Leu
1585 1590 1595 1600

Asp His Ala Phe Lys Ile Lys Phe Ala Asn Gly Glu Leu Ile Asp Phe
1605 1610 1615

Cys Ala Pro Asn Lys His Glu Met Lys Ile Trp Ile Gln Asn Leu Gln
1620 1625 1630

Glu Ile Ile Tyr Arg Asn Arg Phe Arg Arg Gln Pro Trp Val Asn Leu
1635 1640 1645

Met Leu Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Ser Ser Gln Gln
1650 1655 1660
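
As a consistency check between SEQ ID NO:1 and SEQ ID NO:2, the following minimal sketch translates bases 103-120 of SEQ ID NO:1 (taken from the last two blocks of the row ending at position 120) and compares them with the first six residues listed above. The helper name `translate` and the partial codon table are illustrative assumptions, not part of the listing; the table contains only the six standard-code codons needed here.

```python
# Partial codon table: only the six codons needed for this check
# (entries follow the standard genetic code; this is not a full table).
CODON_TO_RESIDUE = {
    "ATG": "Met", "AAC": "Asn", "TCA": "Ser",
    "ACT": "Thr", "CCA": "Pro", "AGT": "Ser",
}


def translate(dna):
    """Translate a DNA string codon by codon using CODON_TO_RESIDUE."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [CODON_TO_RESIDUE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


# Last two 10-base blocks of the second row of SEQ ID NO:1 (positions 101-120).
row_tail = "AAATGAACTC" + "AACTCCAAGT"
# Drop positions 101-102 so the string starts at the ATG (positions 103-105).
orf_start = row_tail[2:]
first_residues = translate(orf_start)
print(first_residues)
```

The printed residues can be compared by eye with the first row of SEQ ID NO:2.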

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 236 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(ix) FEATURE INFORMATION: amino acid positions 218-453 from SEQ ID NO:2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Ser Asp Glu Asp Thr Asn Ala Ser Val Pro Pro Thr Pro Pro Leu His
1 5 10 15

Thr Thr Lys Pro Thr Phe Ala Gln Leu Leu Asn Lys Asn Asn Glu Val
20 25 30

Asn Ser Glu Pro Glu Ala Leu Thr Asp Met Lys Leu Lys Arg Glu Asn
35 40 45

Phe Ser Asn Leu Ser Leu Asp Glu Lys Val Asn Leu Tyr Leu Ser Pro
50 55 60

Thr Asn Asn Asn Asn Ser Lys Asn Val Ser Asp Met Asp Ser His Leu
65 70 75 80

Gln Asn Leu Gln Asp Ala Ser Lys Asn Lys Thr Asn Glu Asn Ile His
85 90 95

Asn Leu Ser Phe Ala Leu Lys Ala Pro Lys Asn Asp Ile Glu Asn Pro
100 105 110

Leu Asn Ser Leu Thr Asn Ala Asp Ile Ser Leu Arg Ser Ser Gly Ser
115 120 125

Ser Gln Ser Ser Leu Gln Ser Leu Arg Asn Asp Asn Arg Val Leu Glu
130 135 140

Ser Val Pro Gly Ser Pro Lys Lys Val Asn Pro Gly Leu Ser Leu Asn
145 150 155 160

Asp Gly Ile Lys Gly Phe Ser Asp Glu Val Val Glu Ser Leu Leu Pro
165 170 175

Arg Asp Leu Ser Arg Asp Lys Leu Glu Thr Thr Lys Glu His Asp Ala
180 185 190

Pro Glu His Asn Asn Glu Asn Phe Ile Asp Ala Lys Ser Thr Asn Thr
195 200 205

Asn Lys Gly Gln Leu Leu Val Ser Ser Asp Asp His Leu Asp Ser Phe
210 215 220

Asp Arg Ser Tyr Asn His Thr Glu Gln Ser Ile Leu
225 230 235
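
SEQ ID NO:3 is described as amino acid positions 218-453 of SEQ ID NO:2. The sketch below illustrates the 1-based, inclusive position arithmetic behind that statement, using only the row of SEQ ID NO:2 numbered 209-224 and the first seven residues of SEQ ID NO:3. The function name `residues_between` and its parameters are hypothetical helpers introduced for illustration.

```python
def residues_between(residues, first_position, start, end):
    """Return the residues numbered start..end (1-based, inclusive).

    `residues` is a list whose first element carries the sequence
    number `first_position`.
    """
    lo = start - first_position
    hi = end - first_position + 1
    if lo < 0 or hi > len(residues) or lo >= hi:
        raise ValueError("requested positions fall outside the supplied row")
    return residues[lo:hi]


# Row of SEQ ID NO:2 covering positions 209-224.
row_209_224 = ["Trp", "Lys", "Asp", "Pro", "Ser", "Gln", "Phe", "Asn",
               "Tyr", "Ser", "Asp", "Glu", "Asp", "Thr", "Asn", "Ala"]
# First seven residues of SEQ ID NO:3.
seq3_start = ["Ser", "Asp", "Glu", "Asp", "Thr", "Asn", "Ala"]

# An inclusive range 218..453 contains end - start + 1 residues.
fragment_length = 453 - 218 + 1
overlap = residues_between(row_209_224, 209, 218, 224)
print(fragment_length, overlap == seq3_start)
```

The computed length agrees with the 236 amino acids stated under SEQUENCE CHARACTERISTICS for SEQ ID NO:3.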

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Tyr Leu Ser Pro Thr Asn Asn Asn Asn Ser Lys Asn Val Ser Asp Met
1 5 10 15

Asp Leu His Leu Gln Asn Leu
20 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Asp Trp Lys Leu Glu Asp Ser Asn Asp Gly Asp Arg Glu Asp Asn Asp 
1 5 10 15

Asp Ile Ser Arg Phe Glu Lys
20 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Ser Lys Ser Ala Asn Thr Val Arg Gly Asp Asp Asp Gly Leu Ala Ser 
1 5 10 15

Ala 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Asp His Leu Asp Ser Phe Asp Arg Ser Tyr Asn His Thr Glu Gln Ser
1 5 10 15

Ile



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Trp Ile Gln Asn Leu Gln Glu Ile Ile Tyr Arg Asn Arg Phe Arg Arg
1 5 10 15

Gln



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
GAATTCAATG CTACCCTCAA 20 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
CCCGGGGGAC CCCCTTCACT 20 
(2) INFORMATION FOR SEQ ID NO: 11: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
AARGTYGGWT TYTTYAAR 
(2) INFORMATION FOR SEQ ID NO: 12: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GAAATHGAYG AYTTRATG 
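
SEQ ID NO:11 and SEQ ID NO:12 contain IUPAC ambiguity codes (R, Y, W, H), so each listing entry stands for a pool of concrete oligonucleotides. The following sketch, offered only as an illustration, counts and enumerates the members of each pool. The names `IUPAC`, `degeneracy`, and `expand` are assumptions introduced here, and the table covers only the symbols that occur in these two sequences.

```python
from itertools import product

# IUPAC nucleotide codes used in SEQ ID NO:11 and SEQ ID NO:12
# (subset only: R = A/G, Y = C/T, W = A/T, H = A/C/T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "H": "ACT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


seq_id_11 = "AARGTYGGWT" + "TYTTYAAR"
seq_id_12 = "GAAATHGAYG" + "AYTTRATG"

print(len(seq_id_11), degeneracy(seq_id_11))
print(len(seq_id_12), degeneracy(seq_id_12))
```

Both strings are 18 bases long, matching the stated lengths; the degeneracy is the product of the number of alternatives at each ambiguous position.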